Tonal peaks in the spontaneous speech of vantage level Hungarian learners of Spanish

This paper reports on a two-part research project conducted to see how Hungarian learners with at least vantage level of Spanish realize melodic peaks in their Spanish utterances. First, we focus on the tonal and distributional characteristics of melodic peaks, taking into consideration the proportion of the rise in f0 with respect to the previous syllable and examining whether the affected syllable is lexically stressed. Second, the range of the tonal rise up to the first peak of the utterance is analyzed. The method applied in both cases is Cantero Serena's Prosodic Analysis of Speech (2019), which represents intonation by objectively comparable standardized melodic curves. The differences found in the speech of Hungarian learners as compared to native Spanish speakers have not proved to be significant in the aspects analyzed here. The main finding of the research is that native Spanish speakers tend to realize the first peak of their declarative sentences as the highest f0 point of the utterance, whereas this is less typical in the oral production of Hungarian learners of Spanish.


INTRODUCTION
The present study deals with the realization of melodic peaks in the spontaneous speech of Hungarian learners of Spanish (HLS), as compared to the native Spanish realization. Dissimilarities are expected between the native Spanish realization and that of HLS, as the distribution of lexical stress is different in the two languages, and as Hungarians are reported by native Spanish speakers to produce unusual stress and melodic patterns (Baditzné Pálvölgyi 2019).
Word stress is the result of the prominence given to a syllable compared to the rest of the syllables in the word (Hualde et al. 2010, 103) by means of changes in the fundamental frequency, intensity or duration with respect to its context (Quilis 1999, 385). Therefore, the three prosodic characteristics that can play a prominent role in accent perception are tone, intensity and duration, but to date there is no consensus in the literature on whether the stressed Spanish syllable is pronounced with a higher tone, with longer duration or with greater intensity compared to its adjacent context. According to Navarro Tomás (1964), the stressed syllable is marked by greater intensity; according to Llisterri et al. (2003), by higher fundamental frequency (f0); and according to Ortega-Llebaria (2006), the latter is complemented by longer duration.
In this study I focus on the comparison of the intonational aspect of stress realization regardless of the position of the word stress within the utterance. I will analyze two corpora to determine what relative tonal values characterize melodic peaks as compared to the previous syllables, since these prosodic characteristics can only be interpreted in relation to their environment. Though the most comprehensive literature on contemporary Spanish intonation includes works in the autosegmental ToBI framework (Prieto et al. 2010-2014) or as part of the AMPER project (Atlas Multimedia de Prosodia del Espacio Románico, cf. e.g. Dorta 2013), in the present paper Cantero Serena's approach (2002) is followed (for the reasons for this choice, see Section 3), in which melodic analysis is based on the representation of tonal movements between syllables, expressed in terms of percentages.
Hungarian and Spanish stress systems are radically different in the sense that Hungarian, as opposed to Spanish, has a fixed word-initial position for lexical stress. In Spanish, the stressed syllable is most typically the penultimate one (Delattre 1965), but in lexical words it must fall on one of the last three syllables of the word (the Three Syllable Window Restriction, cf. Alcoba & Murillo 1988, 153). Stress placement is therefore a challenging area of L2 learning for Hungarian learners of Spanish, as only disyllabic Spanish words are typically stressed on their first syllable.
In Spanish, melodic peaks characteristically occur on the lexically stressed syllable or the one after it (cf. Face 2001; Beckham et al. 2002). The elements of the intonational contour according to Cantero Serena & Font-Rotchés (2007) are the following: anacrusis, body and final inflection (FI). By anacrusis they mean the unstressed syllables preceding the first peak, which is normally on the first stressed vowel in the contour but can also be displaced to the left or to the right of the first lexical stress. They define as body the syllables between the first peak and the last stressed vowel in the contour (the latter also known as the nucleus), from which the final inflection (with the word 'inflection' referring to 'tonal change') begins, see Fig. 1. There is a parallel between this structure and the classical British division of English intonational phrases into prehead, head (anacrusis), body, nucleus and tail (final inflection), cf. Kingdon (1958), and also with the first boundary tone (anacrusis), pitch accents (body, a H* associated to the syllable bearing the first peak) and the combination of the last pitch accent, the phrase accent and the final boundary tone (final inflection) in the autosegmentalist terminology (cf. Pierrehumbert 1980). Thus, there are two important tonal targets that separate these three parts: the first peak and the nucleus. Typically, the first peak is on the first lexical stress and the body is a continuous descent (declination). The final inflection starts from the nucleus. In declarative sentences, the intonation contour is shaped either like a suspension bridge (cf. Bolinger 1961), in which internal word stresses are not given melodic prominence, only the first and the last ones (figuring as the first peak and the nucleus, respectively), or like a continuously descending melody from the first peak on, with internal melodic peaks on or immediately after stressed syllables (cf. Chela-Flores 2003) (Fig. 2).
The most prominent rise in declarative sentences thus characterizes the first peak, which is itself the highest point of the utterance. In European Spanish, the rise to the first peak may reach up to 40% as compared to the utterance-initial syllable, and is followed by declination until the last stressed syllable, from where there is either a moderate rise (not reaching more than 15%) or a fall (up to 40%, cf. Fig. 3).
Previous research has shown that stress in the oral production of even advanced level (B2) Hungarian learners of Spanish is in fact one of the areas most criticized by native Spanish speakers (Baditzné Pálvölgyi 2019). As a possible trait of negative transfer, A2-B1 level Hungarian learners of Spanish often realize melodic peaks on the first syllables of lexical words even when in Spanish that syllable would receive no stress (Baditzné 2018). This is shown in Fig. 4, in which the first syllables of picar [piˈkar] ('to pick up'), comidas [koˈmiðas] ('food-PL') and terminar [termiˈnar] ('to finish') constitute melodic peaks, although they are not stressed in the target language.
Moreover, although in Spanish declarative utterances the initial rise up to the first peak can reach as much as 40% from the utterance-initial syllable, HLS tend to realize the tonal movement up to the first peak with a much lower percentage of melodic rise (43.5% of the utterances were even realized without a perceivable rise up to the first stressed syllable, cf. Baditzné 2018) or even with a fall from the first syllable, cf. Fig. 5.
The present study deals with the tonal peaks in the spontaneous speech of Hungarian learners of Spanish with at least Vantage Level (B2 according to the Common European Framework of Reference for Languages, CEFRL, Council of Europe 2001), in order to test whether B2 level Hungarian learners of Spanish still produce inadequate melodic peaks in their spoken Spanish. Given that B2 level HLS are reported to sound unnatural to the Spanish ear in their stress realization, I examine the following research questions: (1) Do HLS realize tonal peaks differently from the native Spanish realization? (2) Do HLS realize anacruses differently from the native Spanish realization?
In order to answer the research questions, a corpus of 100 utterances by native speakers of European Spanish and 100 utterances by Hungarian learners of Spanish was compiled and investigated from the point of view of the realization of melodic peaks in declarative utterances.

CORPUS
The native Spanish corpus was obtained from the 'Map Task' activities (in which one speaker has to give instructions to the other on how to get from point 'A' to point 'B') of the Interactive Atlas of Romance Intonation, compiled by Prieto et al. (2010-2014). It contains only spontaneous speech samples from 16 informants (3 men and 13 women), taken from recordings with a total duration of 51 min and 23 s. Only monolingual areas were chosen for the analysis, thus leaving out territories such as Catalonia, Valencia or the Balearic Islands (Catalan-speaking zones), Galicia (Galician-speaking zone), the Basque Country and La Rioja (Basque-speaking zones), because these regions could have shown influences of other peninsular languages. The corpus of Hungarian learners of Spanish consists of 16 audio recordings produced in a soundproof room. They contain a total of 80 min 53 s of a Map Task activity, in which 16 speakers (2 men and 14 women) had to inform the interviewer about the correct itinerary. All the Hungarian informants were university students of Spanish Language and Literature in Budapest, with B2 level proficiency in Spanish. Table 1 sums up the data related to the informants. In both corpora, only declarative utterances of at least three syllables were taken into consideration.

METHODOLOGY
The theoretical approach is based on Cantero Serena (2002) and his Prosodic Analysis of Speech (PAS, Cantero Serena 2019). According to him, intonation must be interpreted strictly as the succession of relevant f0 variations, and it acts at three levels: the prelinguistic, the linguistic and the paralinguistic one. Marking stress by melodic means is part of the prelinguistic level of intonation. At the prelinguistic level, the only function of intonation is the segmentation of speech into units, without conveying any additional meaning. This level is the one most closely connected to 'foreign' accent, as non-native segmentation patterns automatically trigger the perception of speakers as foreigners. At the purely linguistic level, intonation contributes to the meaning of the utterance in the sense that it can express three aspects: whether it is interrogative, 'finished' (not followed immediately by another utterance) or emphatic. Other meanings such as 'rage' or 'irony' are not expressed merely by intonational means, as they pertain to the paralinguistic level of intonation and are complemented by other prosodic devices, such as duration or intensity. Thus, listeners can only identify whether the intonation of an utterance is emphatic rather than neutral; whether it expresses rage, for instance, or another emotion depends on other prosodic or even non-linguistic factors, such as gestures.
This study deals with prelinguistic intonational phenomena, such as the presence and the position of melodic peaks in utterances and their relation to stressed syllables. A peak is a tonal movement in the melodic curve with a rise of at least 10% as compared to the previous tonal unit (this is the Spanish threshold of perception, according to Font-Rotchés & Mateo 2011). In order to decide whether a tonal movement reaches this value, we must standardize the melodic curves, so that the contours can be objectively compared.
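The 10% criterion can be illustrated with a minimal sketch. It assumes that the tonal movements have already been expressed as percentages relative to the previous tonal unit; the function name `peak_positions` and the sample values are hypothetical and are not taken from either corpus.

```python
# Perception threshold for Spanish quoted in the text (Font-Rotchés & Mateo 2011).
PEAK_THRESHOLD_PCT = 10.0


def peak_positions(pct_moves, threshold=PEAK_THRESHOLD_PCT):
    """Return the indices of tonal units reached by a rise of at least `threshold` %.

    `pct_moves[k]` is the movement (in %) from tonal unit k to tonal unit k + 1,
    so the returned indices count tonal units from 0 (the utterance-initial unit).
    """
    return [i for i, pct in enumerate(pct_moves, start=1) if pct >= threshold]


# Hypothetical movements between five consecutive tonal units.
example_moves = [16.2, -13.5, 4.0, 12.0]
print(peak_positions(example_moves))
```

This reading treats every qualifying rise as a peak, which is a literal interpretation of the definition above rather than a full implementation of the PAS procedure.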
The phases of standardization are discussed briefly below. In Cantero Serena's PAS model, the f0 value of each syllable is first identified using an acoustic analysis program such as Praat (Boersma & Weenink 2020), and the values are then standardized in order to obtain objectively comparable melodic patterns.
The first phase of the PAS analysis removes irrelevant micromelodic variations by reducing each syllable to a characteristic tonal value. In the case of tonal inflections within syllables, the extreme values of f0 are taken into account. In Fig. 6, we can see that in the utterance Vas a pasar una casita 'You will pass by a small house', the syllable a (a preposition) is characterized by 244 Hz at the beginning, and the f0 value measured at its endpoint is 211 Hz. This means that there is a tonal instability within this syllable, so we cannot take the f0 value measured at the centre of the syllabic nucleus; instead, the two extreme values must be taken into account.
In this case, both tonal values are represented in the curve, by inserting a dot within the syllable, in order to indicate that there is an inner inflection within the syllable. The standardized contour is represented by a line that starts with an arbitrary value of 100% and anchors in each syllable, which is itself characterized by a percentage based on its tonal position as compared to the previous syllable. If the syllable has a lower f0 value than the previous one, the percentage is negative, and if it has a higher value, the percentage is positive. In the case of our previous utterance, the first syllable va (literally 'go'), with 210 Hz, is given the arbitrary value of 100 in the standardized curve, and the next value (244 Hz) is given 116, as 244 Hz is 16% higher than 210 Hz (cf. Fig. 7).
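The arithmetic of this step can be sketched as follows. The three input values (210, 244 and 211 Hz) are the ones quoted above for the beginning of the utterance; the function name `standardize` is hypothetical, and the sketch only reproduces the percentage arithmetic, not the full PAS workflow (segmentation, perceptual validation, correction).

```python
def standardize(f0_hz, start=100.0):
    """Convert a sequence of f0 values (Hz) into a standardized contour.

    Returns (pct_moves, curve):
      pct_moves[k] is the movement in % from value k to value k + 1;
      curve[k] is the standardized value of tonal unit k, starting at `start`.
    """
    if not f0_hz:
        raise ValueError("at least one f0 value is required")
    pct_moves = []
    curve = [start]
    for prev, cur in zip(f0_hz, f0_hz[1:]):
        pct = (cur - prev) / prev * 100.0
        pct_moves.append(pct)
        # Each standardized value applies the percentage to the previous one.
        curve.append(curve[-1] * (1.0 + pct / 100.0))
    return pct_moves, curve


# First three tonal values of "Vas a pasar una casita" as given in the text.
pct_moves, curve = standardize([210.0, 244.0, 211.0])
print([round(v) for v in curve])  # rounds to [100, 116, 100]
```

Because every value is expressed relative to the preceding one, two speakers with different pitch ranges but the same proportional movements yield the same standardized curve.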
Both curves (the absolute one and the standardized copy) are melodically identical. To validate whether the standardized copy sounds the same as the original, it can be synthesized in Praat and submitted to a perception test. If correction is needed, it can be carried out as a final phase. The standardized curve thus ensures that the described melodies are objectively comparable to each other, regardless of the individual tonal characteristics of the informants; what matters are the proportions of the tonal movements (cf. Cantero Serena & Font-Rotchés 2020).
Fig. 6. A spectrogram of the Spanish utterance Vas a pasar una casita 'You will pass by a small house' (the text is my addition), in which the second syllable is characterized by an inner inflection; its highest f0 value is 244 Hz (at the beginning) and the lowest f0 value is 211 Hz (at the end)
Contour standardization was first done by using semitones in the 'Dutch School', also known as the IPO model, cf. e.g. 't Hart et al. (1990), followed later by various researchers (Adriaens 1991; Beaugendre 1994; Odé & van Heuven 1994). In Spanish, Garrido (1991, 1996) and Estruch et al. (2007) worked with automatic stylization methods. The percentages used by the PAS model can show more than the autosegmental labels by themselves would, as they can also express illocution in certain cases (in Spanish, for example, an utterance-final rise of over 80% is decoded by listeners as an interrogative pattern). [...] (2001), also based on percentages and stylized contours, but for them, the first value (100%) is not an arbitrary number, but the first abstract f0 value of declarative sentences. Yes-no questions start at 80% as compared to this value (Olaszy & Koutny 2001, 182-183). The model has also been applied to the description of interlanguage intonation, for instance the Spanish spoken by Brazilians (

RESULTS
Based on what has been discussed so far, the object of this investigation is peak realization in the case of HLS. In other words, the research questions of this study are (1) whether B2 level HLS, similarly to B1 students, still transfer their Hungarian stress patterns to Spanish words by giving melodic prominence to word-initial syllables even if they are unstressed in Spanish, and (2) whether B2 level HLS realize Spanish stressed syllables with perceivable melodic prominence if the syllable is not word-initial.
In order to answer these questions, let us first analyze the distribution of tonal peaks in both corpora (cf. Table 2). Not surprisingly, in Spanish, the typical peak position in an utterance coincides with the stressed syllable or with the one immediately following it. As we can see in Table 2, there are no considerable differences between the two corpora in the proportion of different peak locations with respect to the position of stress, because Hungarian learners of Spanish produced melodic peaks in almost the same proportion in each category as the native Spanish speakers did. This means that, contrary to what has been attested in former research with B1 level Hungarian learners of Spanish (Baditzné 2018), Spanish melodic stress patterns are reproduced quite adequately by B2 level students.
It is also remarkable that the mean values of tonal movements to the stressed syllables, as well as the mean values of tonal movements from the stressed syllable to the immediately following one, practically coincide in the two corpora; statistical testing revealed no significant difference in the means (f0 values in %, with an added constant, were log-transformed for the purposes of statistical testing; in both cases the Mann-Whitney test was applied in SPSS) (Figs 8 and 9).
We have not found considerable differences between the two corpora as far as peak location is concerned; let us now take a closer look at the first peaks within the utterances and at the properties of the anacruses. As already mentioned, in Spanish declarative utterances the melody rises until the first peak (which usually coincides with the first stressed syllable), the rise can reach up to 40%, and the first peak is the highest tonal point of the utterance. Table 3 sums up our results concerning the melodic properties of the anacruses in both corpora.
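The significance tests on the tonal movements to and from stressed syllables were run in SPSS; the following Python sketch only illustrates the kind of computation involved. The value of the added constant is not given in the text, so `constant=101.0` is an assumption, as are the function names and the sample data. The normal approximation used here omits the tie correction, so it is not expected to reproduce SPSS output exactly when the data contain ties.

```python
import math
from statistics import NormalDist


def log_transform(pct_moves, constant=101.0):
    """Log-transform percentage movements after adding a constant (assumed value)."""
    return [math.log(p + constant) for p in pct_moves]


def midranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U, z, two-sided p) using the normal approximation, no tie correction."""
    n1, n2 = len(sample_a), len(sample_b)
    ranks = midranks(list(sample_a) + list(sample_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u  # never positive, because u is the smaller U
    p = 2 * NormalDist().cdf(z)
    return u, z, p


# Hypothetical percentage movements, not corpus data.
native = [12.0, 18.5, 7.0, 25.0, 15.5]
learners = [9.0, 14.0, 5.5, 11.0, 20.0]
print(mann_whitney_u(log_transform(native), log_transform(learners)))
```

Since the logarithm is strictly increasing, the transformation leaves the ranks unchanged, so the U statistic in this sketch is the same with or without it.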
We can see that in the case of HLS, the mean value for the tonal rise in the anacruses is slightly lower than in the native Spanish corpus (cf. also the boxplots in Fig. 10).
The other results show no considerable difference (apart from the higher proportion of first peaks on utterance-initial syllables in the case of HLS, although the proportion of first peaks on utterance-initial unstressed syllables is the same), with one prominent exception: for native speakers, the highest peak of the utterance is the first one (cf. Fig. 11), but HLS do not seem to follow this tendency. This means that B2 level HLS apparently do not realize tonal peaks with considerable differences compared to native Spanish speakers apart from one melodic aspect: their first peaks are not located at the highest point of the utterance, and the anacrusis is also characterized by a lower rise.
Fig. 8. Boxplots representing the tonal movements to the stressed syllables (extreme outliers, < Q1 - 3 * IQR and > Q3 + 3 * IQR, were removed from the plot). Mann-Whitney U = 38,245, Z = -0.296, P = 0.768
Fig. 9. Boxplots representing the tonal movements from the stressed syllables (extreme outliers, < Q1 - 3 * IQR and > Q3 + 3 * IQR, were removed from the plot). Mann-Whitney U = 43,449.5, Z = -1.236, P = 0.217
Height of the first peak                               Spanish   HLS
The first peak is the highest point of the utterance   45        17
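The two anacrusis properties compared here, the size of the rise up to the first peak and whether that peak is the highest point of the utterance, can be derived from a standardized curve along the following lines. This is a sketch under stated assumptions: the first peak is taken to be the first tonal unit reached by a rise of at least 10%, the case of a first peak on the utterance-initial syllable is not handled, and the function name and both curves are invented for illustration.

```python
def first_peak_profile(curve, threshold=10.0):
    """Describe the first peak of a standardized curve (first value = 100).

    Returns None if no tonal unit is reached by a rise of at least `threshold` %.
    Otherwise returns a dict with the index of the first peak, the rise from
    the utterance-initial unit to that peak (in %), and whether the peak is
    the highest point of the whole curve.
    """
    for i in range(1, len(curve)):
        rise_from_previous = (curve[i] - curve[i - 1]) / curve[i - 1] * 100.0
        if rise_from_previous >= threshold:
            return {
                "index": i,
                "anacrusis_rise_pct": (curve[i] - curve[0]) / curve[0] * 100.0,
                "is_highest": curve[i] >= max(curve),
            }
    return None


# Invented curves: the first peak is the highest point only in the first one.
first_peak_highest = [100.0, 105.0, 130.0, 120.0, 110.0, 95.0]
later_peak_higher = [100.0, 112.0, 105.0, 125.0, 90.0]
print(first_peak_profile(first_peak_highest))
print(first_peak_profile(later_peak_higher))
```

Counting the utterances for which `is_highest` is true in each corpus would give the kind of tally reported in the table above.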

CONCLUSIONS AND DISCUSSION
Our results indicate that in the case of Hungarian learners of Spanish with at least B2 level, stressed syllables are realized in the same way from the perspective of intonation as in the native Spanish corpus. The only considerable difference concerns the first peaks: the melody until the first peak rises higher in the native Spanish corpus, and the first peak is characteristically the highest melodic point of the whole utterance. Evidently, a larger corpus and the analysis of further intonational parameters would help us specify which well-defined prosodic cues contribute to the unnatural stress realization of HLS. If HLS sound odd to native Spanish speakers from the point of view of stress realization, it might not only be because of inadequate melodic realization (in our case, especially on the first peak), but rather because of other prosodic devices (such as intensity or duration) that may constitute a further area of negative linguistic transfer. If the differences are explainable only by melodic cues, the non-native realization may not be the result of merely atypical stress alignment. According to the investigations of Barnes et al. (2012), apart from alignment, the shape of the melody until the f0 turning point also influences listeners' perception.

ACKNOWLEDGMENTS
Supported by the ÚNKP-20-5 New National Excellence Program of the Ministry for Innovation and Technology and by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences.
I am sincerely grateful for all the valuable suggestions made by my reviewers, which helped me to improve the quality of the paper. I am particularly indebted to Lena Borise for her insightful
Fig. 11. Typical Spanish declarative utterance with the first peak at the highest point of the melodic curve in the sentence Y sigue de frente hasta que veas un bar que se llama Marina 'And go on in front of you till you see a bar called Marina'